
    Internet delivery of time-synchronised multimedia: the SCOTS projects

    The Scottish Corpus of Texts and Speech (SCOTS) Project at Glasgow University aims to make available over the Internet a 4-million-word multimedia corpus of texts in the languages of Scotland. Twenty percent of this final total will comprise spoken language, in a combination of audio and video material. Versions of SCOTS have been accessible on the Internet since November 2004, and regular additions are made to the Corpus as texts are processed and functionality is improved. While the Corpus is a valuable resource for research, our target users also include the general public, and this has important implications for the nature of the Corpus and website. This paper will begin with a general introduction to the SCOTS Project, and in particular to the nature of our data. The main part of the paper will then present the approach taken to spoken texts. Transcriptions are made using Praat (Boersma and Weenink, University of Amsterdam), which produces a time-based transcription and allows for multiple speakers through independent tiers. This output is then processed to produce a turn-based transcription with overlap and non-linguistic noises indicated. As this transcription is synchronised with the source audio/video material, it allows users direct access to any particular passage of the recording, possibly based upon a word query. This process and the end result will be demonstrated and discussed. We shall end by considering the value which is added to an Internet-delivered Corpus by these means of treating spoken text. The advantages include the possibility of returning search results from both written texts and multimedia documents; the easy location of the relevant section of the audio file; and the production through Praat of a turn-based orthographic transcription, which is accessible to a general as well as an academic user. These techniques can also be extended to other research requirements, such as the mark-up of gesture in video texts.
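    The conversion described above, from a time-based, tier-per-speaker transcription to a turn-based one with overlap marked, can be sketched as follows. This is a minimal illustration, assuming a simple interval representation; the interval fields and overlap notation are assumptions, not the SCOTS project's actual code.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    speaker: str
    start: float  # seconds into the recording
    end: float
    text: str

def to_turns(tiers):
    """Merge per-speaker interval tiers (Praat-style) into one chronological
    turn list, flagging any turn that overlaps the previous speaker's turn."""
    intervals = sorted(
        (iv for tier in tiers for iv in tier if iv.text.strip()),
        key=lambda iv: iv.start,
    )
    turns = []
    for iv in intervals:
        # A turn overlaps if it starts before the previous turn has ended.
        overlap = bool(turns) and iv.start < turns[-1][2]
        turns.append((iv.speaker, iv.start, iv.end, iv.text, overlap))
    return turns
```

Because each turn keeps its start time, a word query can resolve directly to the matching offset in the audio or video file, which is the synchronisation the abstract describes.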

    Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project

    In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings and labels them in terms of interconnected “hyper-events” (a notion inspired by hyper-texts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve, and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: networked multimedia events; audio processing; speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events.
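    The hyper-event notion above, a recording's derived facets (transcript, speaker segments, metadata) bundled with cross-links to related hyper-events, can be sketched as a small data structure. The field names here are illustrative assumptions, not the inEvent project's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class HyperEvent:
    event_id: str
    media_url: str
    transcript: str = ""                                  # from speech recognition
    speaker_segments: list = field(default_factory=list)  # from speaker diarization
    metadata: dict = field(default_factory=dict)
    links: list = field(default_factory=list)             # ids of related hyper-events

    def link_to(self, other):
        """Cross-link two hyper-events, e.g. when speaker linking finds
        the same speaker appearing across two recordings."""
        self.links.append(other.event_id)
        other.links.append(self.event_id)
```

Indexing the facets separately is what makes each easier to search than the raw recording: a query can match the transcript, the metadata, or the link graph independently.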

    Methods and Techniques for the Access of Persons With Visual Impairments to Handbooks and Textbooks

    The transformation of learning materials into an accessible format for blind students allows them to access texts and graphics; such access was not possible with past technologies. The Daisy format is an accessibility standard that permits a visually impaired person to listen to an audio book just as a person without a disability would, navigating the audio book by its contents and/or pages. The use of this format makes it possible to develop new approaches to teaching and learning, not only for visually impaired students but also for visually impaired teachers. The Daisy format has also proved efficient in education for children with learning difficulties, and studies have examined the efficiency of Daisy books. Keywords: Daisy, visual impairment, access technologies, accessible audio books

    Improving the Listening Skills of Grade VIII Students of SMPN 5 Panggang, Gunungkidul in the Academic Year of 2013/2014 through Digital Media

    This study was aimed at improving the listening skills of Grade VIII students of SMPN 5 Panggang, Gunungkidul in the Academic Year of 2013/2014 through digital media. This collaborative action research study was carried out in three cycles. The design consisted of: (1) planning, (2) action, (3) observation, and (4) reflection. The data of this research study consisted of qualitative and quantitative data gathered through observations, interviews/testimonies, rating, video/audio recording, and performance tests. The qualitative data were analyzed using interactive qualitative data analysis, which included data reduction, data display, and conclusion drawing. The quantitative data were analyzed using descriptive statistics. Efforts were made to fulfil validity criteria, namely democratic validity, outcome validity, process validity, and dialogic validity. Meanwhile, the reliability of the data was established through methodological triangulation, time triangulation, and investigator triangulation. The findings show that the digital media improve the listening learning process, which leads to the improvement of students’ listening skills. The digital media were used through a modified three-stage listening procedure in the following actions: (a) using pictures to direct students’ attention to the lesson, (b) using audio/audio-visual media for extensive listening activities, (c) using audio media for intensive pronunciation practice, (d) using audio/audio-visual media for word recognition activities, and (e) using audio/audio-visual media for listening comprehension activities. The results of the actions are as follows. First, pictures in the pre-listening activity are found to increase students’ interest in the lesson, direct their minds to the topic, and motivate them to learn. Second, audio media make students happy during the instruction and actively involved in the listening instructional process. Third, the audio media also enable students to recognize the communicative functions of the texts, retain the chunks of language, distinguish among the distinctive sounds of English, recognize English stress patterns and intonation, recognize words in the given spoken texts, and comprehend the texts. Fourth, the digital stories (audio-visual media) are able to grab students’ attention, make them happy, and help them focus on the monologues and comprehend them better, so they are able to infer the situations, participants, and goals of the spoken texts.

    'It's like the space shuttle blows up every day': Digital television heritage as memory of European crises in the age of information overload

    Television is a public mediator of what constitutes 'crises' in Europe. Audio-visual archives and researchers are facing new complexities and 'information bubbles' when telling stories and reusing televised materials. I reflect on these practices, among others, via a comparative case analysis of the EUscreen portal offering access to thousands of items of European audio-visual heritage. I question how practices of selection and curation can support comparative interpretations of such representations. This approach aims to understand and support (1) interpretations of digitized/digital audio-visual sources in the era of information overload; (2) user interaction with digital search technologies - especially researchers as platform users; and (3) contextualization for reuse of audio-visual texts. Support for cultural memory research is crucial as television's audio-visual heritage can help us to recognize which cultural practices result in the production of specific texts in European societies, representing conditions of the multiple crises that European citizens are experiencing today

    EFFECTS OF HEADINGS ON PROCESSING OF AUDIO TEXTS

    Text-to-speech devices often do a poor job of translating signals such as headings from visual into audio mode. Previous research studies have attempted to address this problem but these studies have mainly used heading detection tasks. The current study seeks to investigate 1) whether listeners find the presence of audio headings useful in natural learning tasks, and 2) the type of heading rendering that is most useful in natural learning tasks. The three learning tasks in this study include note-taking, cued recall, and knowledge transfer. Results from this study reveal that listeners find audio headings useful in the note-taking task. It is less clear how audio headings affect cued recall and knowledge transfer, but there is some evidence that a rendering strategy which conveys audio contrast plus other types of signaling information seems to facilitate cued recall performance

    A Tool to Solve Sentence Segmentation Problem on Preparing Speech Database for Indonesian Text-to-speech System

    Creating training data ready to be used for developing a text-to-speech (TTS) system can be a difficult task, since the recorded audio data sometimes does not match the prepared texts. To overcome differences between audio and text data, we developed a tool to segment audio data into sentences. Manually segmenting audio data into sentences requires considerable effort and resources. This paper presents a solution for alleviating problems encountered during the segmentation of audio data for developing an Indonesian TTS system. The tool was developed based on the fact that bahasa Indonesia is a syllable-timed language. We found that our tool reduces the resources needed for segmenting Indonesian audio data.
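    The syllable-timing idea above can be illustrated briefly: in a syllable-timed language, syllables take roughly equal time, so cumulative syllable counts in the text can predict where sentence boundaries fall in the audio. The vowel-cluster syllable counter and proportional mapping below are illustrative assumptions, not the authors' actual tool.

```python
import re

def count_syllables(sentence):
    """Approximate an Indonesian syllable count as the number of vowel groups."""
    return max(1, len(re.findall(r"[aeiou]+", sentence.lower())))

def estimate_boundaries(sentences, total_duration):
    """Allocate the audio duration to sentences in proportion to their
    syllable counts, yielding an estimated (start, end) slice per sentence."""
    counts = [count_syllables(s) for s in sentences]
    total = sum(counts)
    bounds, t = [], 0.0
    for c in counts:
        dur = total_duration * c / total
        bounds.append((t, t + dur))
        t += dur
    return bounds
```

Such estimates would only seed the segmentation; a real tool would still refine each boundary against silence or forced-alignment cues in the audio.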

    Uncovering Scientific and Multimodal Literacy through Audio Description

    Today’s scientific texts are complex and multimodal. Due to new technology, the number of images is increasing, as is their diversity and complexity. Interaction with complex texts and visualisations becomes a challenge. How can we help readers and learners achieve multimodal literacy? We use data from the audio description of a popular scientific journal and think-aloud protocols to uncover the knowledge and competences necessary for reading and understanding multimodal scientific texts. Four issues of the printed journal were analysed. The aural version of the journal was compared with the printed version to show how the semiotic interplay has been presented for the users. Additional meaning-making activities were identified from the think-aloud protocol. As a result, we could reveal how the audio describer combined the contents of the available resources, made judgements about relevant information, determined ways of verbalising visual information, used conceptual knowledge, filled gaps in the interplay of the resources, and reordered information for optimal flow and understanding. We argue that the meaning-making activities identified through audio description and think-aloud protocols can be incorporated into instruction in educational contexts and can thereby improve readers’ competencies for reading and understanding multimodal scientific texts.

    Extraordinary Everyday Stories: Audio Resources for the Communication Instructor

    Communication instructors often supplement course texts with artistic works such as feature films, short stories, and memoirs. A less common form of supplementary material is the audio documentary/story. The discussion below introduces several audio resources likely to help students deepen their understanding of communication in general and interpersonal and intercultural communication in particular. I also offer a few ideas to those instructors wishing to help students create their own small-scale audio productions